Reversing Moore's Law
- 2x every 18 months, 1000x in 15 years
- alternatively 1000x cheaper machines
- a million-dollar machine today will be a high-end smartphone or a low-end laptop in 15 years
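
The arithmetic behind these bullets can be checked with a short sketch. The function name and the 1.5-year default are illustrative choices, not part of the original slides; the only inputs taken from the text are the 18-month doubling period, the 15-year horizon, and the million-dollar price.

```python
def growth_factor(years: float, doubling_period_years: float = 1.5) -> float:
    """Performance multiple after `years`, assuming one doubling per period."""
    return 2.0 ** (years / doubling_period_years)


# 15 years at one doubling every 18 months is 10 doublings.
factor = growth_factor(15)

# Read the same factor as a price drop at constant performance.
cost_today = 1_000_000
future_cost = cost_today / factor

print(factor)       # 1024.0
print(future_cost)  # 976.5625
```

Ten doublings give 1024x, which the slides round to 1000x; a million-dollar machine lands just under a thousand dollars.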
The 2030 Computer
- 18x8 2.5GHz cores
- 45MB L3
- 12TB of RAM (QPI 9.6GT/s)
- 56 PCIe slots
Problems & Constraints
- Tiny cache wrt RAM+PCIe
- Insufficient memory bandwidth
- Usual shared memory shenanigans
- Power consumption
- Largish CPU die size
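
To put a number on "tiny cache", here is a rough calculation using the figures from the 2030 Computer list. It assumes decimal units and treats the listed 45MB as the total L3 available, which may understate a multi-socket machine; both are assumptions, not statements from the slides.

```python
MB = 10**6   # decimal megabyte (assumption; binary units change the result slightly)
TB = 10**12  # decimal terabyte

l3_bytes = 45 * MB
ram_bytes = 12 * TB

# Fraction of RAM that fits in L3 at any one time.
ratio = l3_bytes / ram_bytes

# Bytes of RAM competing for each byte of L3.
ram_per_cache_byte = ram_bytes / l3_bytes

print(f"{ratio:.2e}")             # 3.75e-06
print(round(ram_per_cache_byte))  # 266667
```

Under these assumptions, fewer than four bytes in every million of RAM can be cache-resident at once, before counting anything reachable over PCIe.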
Separation of Church and State
- CPUs for coarse data manipulation (e.g. array processing, text manipulation, graphics): cache-oblivious, data-parallel
- CPUs for application control logic & state, should fit in L3 cache
- Communicate via message passing (e.g. π-calculus, actor model)
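
A minimal sketch of this split, assuming Python threads and queues as stand-ins for the two kinds of CPU and their message channel. The names `data_worker` and `run_control` are hypothetical; the point is only that the control side keeps small state and never touches the bulk data except through messages.

```python
import queue
import threading


def data_worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Data side: performs coarse array work on whatever arrives as a message."""
    while True:
        msg = inbox.get()
        if msg is None:  # sentinel message: shut down
            break
        job_id, values = msg
        outbox.put((job_id, sum(values)))


def run_control(jobs: dict) -> dict:
    """Control side: small state, delegates bulk work by message only."""
    to_worker: queue.Queue = queue.Queue()
    from_worker: queue.Queue = queue.Queue()
    worker = threading.Thread(target=data_worker, args=(to_worker, from_worker))
    worker.start()

    for job_id, values in jobs.items():
        # Send a copy so no mutable state is shared between the two sides.
        to_worker.put((job_id, list(values)))

    results = {}
    for _ in jobs:
        job_id, total = from_worker.get()
        results[job_id] = total

    to_worker.put(None)
    worker.join()
    return results


print(run_control({"a": [1, 2, 3], "b": [10, 20]}))  # {'a': 6, 'b': 30}
```

Because the only coupling is the pair of queues, the worker could in principle be replaced by a different processor type or a remote machine without changing the control logic.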
Avoiding the Carbon Tax
- Minimalist instruction sets, different for each specialized CPU type
- Streamlined memory access patterns: all memory is a cache for persistent storage
- Reduced inter-CPU communication
- Specialized CPUs should idle more often
A Principled Foundation
- A mix of λ-, π-, and ς-calculus as assembly language
- Kernel language approach
- Explicit multiple staging in machine code
- Treat each CPU as a remote computer